Relaxing instance boundaries for the search of splitting points of numerical attributes in classification trees
Abstract
We propose a simple heuristic partition method (HPM) for classification trees that improves efficiency in the search for splitting points of numerical attributes. The proposal is motivated by the idea that the selection of candidate splitting points can be made more flexible so as to achieve faster computation while retaining classification accuracy. We compare the performance of HPM against Fayyad’s method, since the latter improves on the standard C4.5 algorithm in the search for splitting points. We demonstrate that HPM is more efficient, in some cases by as much as 50%, while producing essentially the same classification for six different data sets. Our result supports the relaxation of instance boundaries (RIB) as a valid approach that can be explored to achieve more efficient computations. © 2006 Elsevier Inc. All rights reserved.
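For readers unfamiliar with the baseline, the following is a minimal sketch of how candidate splitting points for a numerical attribute are commonly enumerated: either every midpoint between adjacent distinct values, or only the midpoints at class boundaries, which is the restriction usually attributed to Fayyad's method. It does not implement HPM, whose details are not given in the abstract; the function names and the toy data are assumptions made for illustration only.

```python
from itertools import groupby


def all_midpoints(values):
    """Exhaustive candidates: every midpoint between adjacent distinct values."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def boundary_midpoints(values, labels):
    """Boundary candidates only (hypothetical helper, not the paper's HPM).

    A midpoint between two adjacent distinct values is skipped when every
    instance at both values belongs to one and the same class.
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    # Group instances that share an attribute value and collect their classes.
    groups = [
        (value, {label for _, label in group})
        for value, group in groupby(pairs, key=lambda p: p[0])
    ]
    cuts = []
    for (v1, c1), (v2, c2) in zip(groups, groups[1:]):
        same_single_class = len(c1) == 1 and c1 == c2
        if not same_single_class:
            cuts.append((v1 + v2) / 2)
    return cuts


# Toy data (assumed): one numerical attribute and a two-class label.
values = [1, 2, 3, 4, 5, 6]
labels = ["a", "a", "a", "b", "b", "a"]

exhaustive = all_midpoints(values)
boundaries = boundary_midpoints(values, labels)
print(exhaustive)
print(boundaries)
```

On this toy input the boundary restriction reduces five candidates to two, which is the kind of saving a faster candidate-selection scheme aims to extend.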
Similar resources
A New Algorithm for Optimization of Fuzzy Decision Tree in Data Mining
Decision-tree algorithms provide one of the most popular methodologies for symbolic knowledge acquisition. The resulting knowledge, a symbolic decision tree along with a simple inference mechanism, has been praised for comprehensibility. The most comprehensible decision trees have been designed for perfect symbolic data. Classical crisp decision trees (DT) are widely applied to classification t...
Support Vector Machine Based Facies Classification Using Seismic Attributes in an Oil Field of Iran
Seismic facies analysis (SFA) aims to classify similar seismic traces based on amplitude, phase, frequency, and other seismic attributes. SFA has proven useful in interpreting seismic data, allowing significant information on subsurface geological structures to be extracted. While facies analysis has been widely investigated through unsupervised-classification-based studies, there are few cases...
General and Efficient Multisplitting of Numerical Attributes
Often in supervised learning numerical attributes require special treatment and do not fit the learning scheme as well as one could hope. Nevertheless, they are common in practical tasks and, therefore, need to be taken into account. We characterize the well-behavedness of an evaluation function, a property that guarantees the optimal multi-partition of an arbitrary numerical domain to be defined ...
Integrated JIT Lot-Splitting Model with Setup Time Reduction for Different Delivery Policy using PSO Algorithm
This article develops an integrated JIT lot-splitting model for a single supplier and a single buyer. In this model we consider reduction of setup time, and the optimal lot size is obtained from the reduced setup time in the context of joint optimization for both buyer and supplier, under deterministic conditions with a single product. Two cases are discussed: Single Delivery (SD) case, and Multi...
A note on split selection bias in classification trees
A common approach to split selection in classification trees is to search through all possible splits generated by predictor variables. A splitting criterion is then used to evaluate those splits and the one with the largest criterion value is usually chosen to actually channel samples into corresponding subnodes. However, this greedy method is biased in variable selection when the numbers of t...
Journal title: Inf. Sci.
Volume: 177
Publication date: 2007